ABSTRACT

States across the country are developing systems for evaluating school principals on the basis of
student achievement growth. A common approach is to hold principals accountable for the value
added of their schools, that is, schools’ contributions to student achievement growth. In theory,
school value added can reflect not only principals’ effectiveness, but also other school-specific
influences on student achievement growth that are outside of principals’ control. In this paper,
we isolate principals’ effects on student achievement growth and examine the extent to which
school value added captures the effects that principals persistently demonstrate. Using
longitudinal data on the math and reading outcomes of 4th through 8th grade students in
Pennsylvania, we find that school value added provides very poor information for
revealing principals’ persistent levels of effectiveness.

This manuscript has been accepted for publication in Education Finance and Policy.

1. INTRODUCTION

In recent years, policymakers have shown a keen interest in evaluating the effectiveness of
school principals using performance measures based on student test scores. An increasingly
common approach is to measure principal performance using measures of school “value added,”
which capture schools’ contributions to student achievement growth. In this article, we
empirically assess the extent to which these models provide information about principal
effectiveness rather than about factors beyond principals’ control that shape school performance.

Interest in evaluating principals is rooted in the widely held notion that effective leadership
is an important characteristic of successful schools. This notion has a long history in qualitative
studies of effective schools (Purkey and Smith 1983). Some recent quantitative evidence
indicates that school performance is higher when principals are more experienced (Clark,
Martorell, and Rockoff 2009; Dhuey and Smith 2013),¹ have greater organizational management
skills (Grissom and Loeb 2011), and demonstrate greater ability to recruit and retain high-quality
teachers while removing low-quality teachers (Branch, Hanushek, and Rivkin 2012; Loeb,
Kalogrides, and Beteille 2012).

Although principal characteristics and practices might help to identify effective principals,
policymakers are particularly interested in using student outcomes directly to measure principal
quality for purposes of accountability and incentives. States across the country have begun
mandating the incorporation of student achievement growth into principal evaluations. As a
condition for receiving either federal Race to the Top funds or enhanced flexibility under the
Elementary and Secondary Education Act, more than 40 states have agreed to hold principals
accountable for student achievement growth. Nevertheless, they have faced challenges in
developing outcomes-based measures of principal quality for use in performance evaluations.

¹ Other research papers, such as Buck (2012) and Dhuey and Smith (forthcoming), find little relationship
between principal experience and effectiveness at improving student achievement.

In the public eye, principals are often viewed as effective when they lead schools that have
high test scores. But test score levels generally are related more strongly to student and family
characteristics than to a principal’s performance. Value-added models (VAMs) have the potential
to provide effectiveness information for principal evaluations because they seek to account for
student, family, and neighborhood influences. VAMs have been examined extensively in the
context of teacher and school effectiveness.² If a VAM fully accounts for influences beyond
teachers’ control, it can provide valid measures of a teacher’s value added or “effectiveness,”
that is, the teacher’s contribution to student achievement growth.³ Similarly, if a VAM fully
accounts for out-of-school influences on achievement, it can provide valid measures of a
school’s value added.⁴

Only recently have researchers applied the value-added methodology to principal
effectiveness (Branch, Hanushek, and Rivkin 2012; Cannon, Figlio, and Sass 2012; Coelli and
Green 2012; Dhuey and Smith 2013, forthcoming; Grissom, Kalogrides, and Loeb forthcoming;
Lipscomb, Chiang, and Gill 2012). Estimating principal value added is not as straightforward as
estimating teacher or school value added. The key analytical challenge of devising a principal
VAM is to disentangle principals’ true contributions to student achievement growth from the
influence of other school-level factors beyond principals’ control. The authors of existing studies
have addressed this challenge by using school leadership changes to identify a principal’s
effectiveness relative to other principals who have served at the same school. By controlling for
school-level factors outside of principals’ control that do not change over time, this approach can
yield plausibly valid estimates of principal effectiveness. However, a major limitation of this
approach is that it can only evaluate principals of schools that experience a leadership transition
during the period in which performance is being evaluated. Therefore, this method cannot
provide a useful basis for identifying principal quality in actual evaluation systems, which must
include all principals.

² For example, see Aaronson, Barrow, and Sander (2007); Chetty, Friedman, and Rockoff (2011); Deming
(2014); Deutsch (2012); Glazerman et al. (2010); Goldhaber and Hansen (2010); Kane and Staiger (2008); Ladd and
Walsh (2002); McCaffrey et al. (2009); Rivkin, Hanushek, and Kain (2005); and Rothstein (2010).

³ Throughout this article, we use the terms “value added” and “effectiveness” interchangeably, with both terms
denoting contributions to student achievement growth.

⁴ An additional concern with the validity of school value-added estimates is that a school’s estimated
effectiveness in a given year may be influenced by its effectiveness in previous years, because schools, unlike
teachers, serve many of the same students in consecutive years. Recent studies by Deutsch (2012) and Deming
(2014), however, have validated commonly used school VAMs by showing that they are unbiased predictors of
experimentally identified school effectiveness estimates from school choice lotteries in Chicago and
Charlotte-Mecklenburg.

In contrast, school value-added estimates can be generated for all principals, but they are
imperfect indicators of principal quality. School value added combines principals’ contributions
to improved student outcomes with the contributions of other school staff and resources. If
principals exert a strong influence on a school’s effectiveness, then holding them accountable for
the overall effectiveness of their school may be sensible from a management standpoint. Holding
principals accountable for overall school effectiveness makes less sense if most of the variation
in school effectiveness is due to factors outside of a principal’s control.

Several states receiving Race to the Top funds and enhanced flexibility under the
Elementary and Secondary Education Act have proposed to use measures of school-wide
effectiveness as indicators of principal effectiveness. For example, Ohio, North Carolina,
Pennsylvania, and Tennessee are proceeding to use school value-added estimates from the
Education Value-Added Assessment System (EVAAS) in principal evaluations. School value-added
scores have been or will be a determinant of performance bonuses for principals in several





School Value Added and Principal Quality 


incentive programs, including the Teacher Advancement Program and the Teacher Incentive
Fund.

In this article, we examine whether a school VAM is a valid tool for measuring principal
effectiveness. We base our conclusions on the degree to which schools’ effectiveness estimates
are correlated with the effects on student achievement that their principals persistently
demonstrate. The core of our analysis assesses the strength of the association between two types
of value-added estimates: (1) school value-added estimates and (2) principal value-added
estimates that isolate principals’ effects from other persistent, unobserved differences across
schools. To obtain these estimates, we use a longitudinal database of all Pennsylvania students in
grades 4 through 8 from 2007-2008 to 2012-2013. After obtaining these value-added estimates,
we employ a regression model to assess the extent to which differences in schools’ value added
predict differences in the value added of their principals.

Two key elements of our empirical strategy are designed to isolate the portion of a
principal’s effect that is consistent across time and student samples (and therefore presumably of
greatest importance to evaluators). First, the school and principal value-added estimates come
from different time periods. Our analysis therefore provides information about the degree to
which school effectiveness estimates are indicators of persistent differences in principals’
effectiveness. Second, we compare school and principal effectiveness estimates obtained from
distinct groups of students. This avoids correlated sampling errors, that is, situations in which there is a
spurious correlation between school effectiveness and principal quality because, for instance, an
unusually bright cohort of students is falsely inflating both a school’s value-added estimate and
the value-added estimate of its principal.

Our findings pertain to a specific population of principals. To have value-added estimates,
principals in the analysis must have been involved in leadership transitions. They either became
the new principal of a school or were replaced by an incoming principal during the final four
years of our analysis period (2009-2010 to 2012-2013). Our findings thus pertain to principals
who either are in the first four years at their current positions or are being compared with other
principals in the first four years at their positions when their effectiveness is measured. This does
not severely limit the external validity of our findings; in 2012-2013, among Pennsylvania
principals with students in the grades included in our principal VAMs, 55 percent were in their
first four years at their current positions.

Our findings indicate that school value added provides very poor information for evaluating
principals’ persistent levels of effectiveness. In both math and reading, there is no statistically
significant relationship between school value added and principals’ persistent levels of
effectiveness. The magnitudes of the relationships are also small, implying that no more than 7
percent of any given difference in true school value added between two schools reflects
persistent differences in the effectiveness of their current principals.

Our study contributes to the literature on performance-based indicators of leadership quality
that are based on individual-level data by assessing the validity of measures that could be used in
actual evaluations. In the most closely related study, Lipscomb, Chiang, and Gill (2012) found
that principal and school effectiveness estimates are moderately correlated when those estimates
pertain to the same school years and are obtained using the same student samples. We do not
emphasize comparing school and principal effectiveness within the same time period and for the
same students in this study because any observed relationship could reflect, in part, completely
nonpersistent principal effects and correlated sampling errors. That is, we are not merely focused
on whether school and principal value-added estimates come to the same (possibly erroneous)
conclusions; instead, we analyze how closely school value added reflects principals’ true and
persistent levels of effectiveness.

The rest of the relevant research literature primarily has focused on describing the
methodology for estimating principal effects and examining characteristics of the effectiveness
distribution. Branch, Hanushek, and Rivkin (2012) estimate that math scores in Texas are 0.11
standard deviations higher in schools led by principals whose within-school VAM estimate is
one standard deviation higher. Dhuey and Smith (forthcoming) measure the cumulative effects of
middle-school principals in British Columbia, Canada over three grade levels, finding a
within-school standard deviation of principal effectiveness that is 0.36 in math and 0.21 in reading. In a
subsequent paper, Dhuey and Smith (2013) measure the effects of North Carolina principals,
finding that principal-school match quality accounts for a significant portion of the variation in
principal value added. Cannon, Figlio, and Sass (2012) also conclude that principal match quality
matters, based on evidence from Florida that the persistence of a principal’s value-added
estimate declines when a principal changes schools. Coelli and Green (2012) explore models that
allow principal effects to vary during the years of a principal’s tenure at a school, finding that
principals need several years as a school’s leader to have their full effect on student outcomes.

Our findings do not contradict the evidence presented by other studies showing that
principals differ meaningfully in their effectiveness. Instead, our paper shows that differences in
principal effectiveness are not strongly predicted by measures of school value added. Moreover,
as demonstrated by Grissom, Kalogrides, and Loeb (2012), school value-added measures predict
non-test evaluation scores, such as observation-based and survey-based measures, more closely
than principal value-added measures do. When combined with our findings, this result suggests
that non-test evaluation measures may also be misattributing to principals the influence of school
factors outside a principal’s control.


2. EMPIRICAL METHODS 
Conceptual Framework 

The key objective of the empirical analysis is to determine the extent to which school value
added predicts principals’ effectiveness. Toward this objective, it is useful to consider a simple
framework that decomposes school value added into various components. Let $Sva_{spgt}$ be the total
true contribution (without estimation error) of school $s$ to student learning (that is, the
effectiveness or value added of school $s$) in grade $g$ under the leadership of principal $p$ in year $t$.
The school’s value added reflects the value added of its principal ($Pva_{pgt}$) in that grade and year
as well as the influence of all other school-level factors outside of the principal’s control, $F_{sgt}$:

$$Sva_{spgt} = Pva_{pgt} + F_{sgt}. \tag{1}$$




Measures of school effectiveness are informative of principal quality only insofar as they
reflect $Pva_{pgt}$ rather than $F_{sgt}$. Differences in $F_{sgt}$ across schools could arise from several types of
factors beyond principals’ control. Some portion of the variation in teacher quality across schools
could be outside of principals’ discretion. For example, aspects of a school’s physical location
(such as accessibility or proximity to an education school) could make the school more or less
attractive to good teachers. The composition of a school’s teaching force might also reflect hiring
decisions made by district offices. In addition, variation in $F_{sgt}$ could stem from differences in
financial resources, especially if schools from different districts are compared.

Of the variation in school effectiveness that is due to principals’ effects, not all of the
sources of this variation are of equal interest in evaluating principals. We assume evaluators
want to gauge effects on achievement that principals persistently demonstrate. In contrast,
transitory impacts (that is, evidence of effectiveness that a principal exhibits in one year but not
in the next year) are not relevant to predicting which individuals will be good school leaders in
subsequent years.

To consider the variation in principals’ effectiveness, we conceptualize total true principal
effectiveness in a particular grade and year as the sum of four components:

$$Pva_{pgt} = \theta_p + \theta_{pg} + \theta_{pt} + \theta_{pgt}, \tag{2}$$

where $\theta_p$ is the component of a principal’s effect that is common across years and grades; $\theta_{pg}$ is
the component of a principal’s effect that is specific to grade $g$ but persistent across years; and
$\theta_{pt}$ and $\theta_{pgt}$ represent the principal’s nonpersistent impacts, respectively, in specific years and
grade-year combinations. We define each of the four components of $Pva_{pgt}$ to be independent of
the other components. Moreover, we assume that the transitory components ($\theta_{pt}$ and $\theta_{pgt}$) are
independent of $F_{sgt}$ and that $\mathrm{Cov}(\theta_{pg}, \theta_{pg'}) = 0$ for $g \neq g'$, $\mathrm{Cov}(\theta_{pt}, \theta_{pt'}) = 0$ for $t \neq t'$, and
$\mathrm{Cov}(\theta_{pgt}, \theta_{pg't'}) = 0$ for $g \neq g'$ or $t \neq t'$.

In our notation, we assume that evaluators are interested in measuring $[\theta_p + \theta_{pg}]$ for each
principal. The first of these two components represents a persistent ability of principals to
improve student outcomes across the grades at their schools. The second component recognizes
that principals may be more effective in some grades than in others on a persistent basis. For
example, a principal might be particularly knowledgeable about identifying appropriate curricula
for the early elementary grades but not the upper elementary grades.

This framework yields the following key question to be addressed by the analysis: To what
extent do differences in school value added across schools predict differences in the persistent
effectiveness of their current principals? In other words, the objective is to determine the extent
to which differences in $Sva_{spgt}$ predict differences in $[\theta_p + \theta_{pg}]$.

We focus the analysis on predicting $[\theta_p + \theta_{pg}]$ with a linear function of $Sva_{spgt}$. Consider a
scenario in which both $Sva_{spgt}$ and $[\theta_p + \theta_{pg}]$ could be observed perfectly without error. If we ran
a linear regression of $[\theta_p + \theta_{pg}]$ on $Sva_{spgt}$, the coefficient on $Sva_{spgt}$, denoted by $\beta$, would
measure the difference in the persistent component of principal value added that would be
predicted from observing a one-unit difference in school value added. A coefficient of one would
imply that school value added was an unbiased linear predictor of principals’ effectiveness.

The size of $\beta$ is determined by two key factors. These factors are highlighted by noting that

$$\beta = \frac{\mathrm{Cov}(\theta_p + \theta_{pg},\, Sva_{spgt})}{\mathrm{Var}(Sva_{spgt})} = \frac{\mathrm{Var}(\theta_p + \theta_{pg})}{\mathrm{Var}(Sva_{spgt})} + \frac{\mathrm{Cov}(\theta_p + \theta_{pg},\, F_{sgt})}{\mathrm{Var}(Sva_{spgt})}. \tag{3}$$

First, $\beta$ is larger when variation in principals’ persistent levels of effectiveness ($\mathrm{Var}(\theta_p + \theta_{pg})$)
represents a greater fraction of total variation in school value added ($\mathrm{Var}(Sva_{spgt})$). Conversely,
variation in school value added due to non-principal factors or nonpersistent components of
principal quality drives down $\beta$. Second, $\beta$ is larger if more effective principals in a grade are
assigned to schools with more positive additional influences on student learning in that grade
($\mathrm{Cov}(\theta_p + \theta_{pg}, F_{sgt}) > 0$). Compensatory assignment, with more effective principals assigned to
schools in which other factors depress learning growth, will tend to mask principals’ impacts on
their schools.
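
The decomposition of $\beta$ can be checked numerically. The following is a minimal simulation sketch, not the authors' procedure: the function names and all standard deviations are illustrative assumptions, and the non-principal factor $F_{sgt}$ is drawn independently of the principal components, so the sorting term should be close to zero.

```python
import random


def cov(x, y):
    """Population covariance of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def simulate_beta(n=50000, sd_persist=0.10, sd_trans=0.10, sd_other=0.20, seed=0):
    """Simulate Equation (1) and return (beta, variance share, sorting term).

    All standard deviations are illustrative assumptions, not estimates.
    """
    rng = random.Random(seed)
    persist = [rng.gauss(0, sd_persist) for _ in range(n)]  # theta_p + theta_pg
    trans = [rng.gauss(0, sd_trans) for _ in range(n)]      # theta_pt + theta_pgt
    other = [rng.gauss(0, sd_other) for _ in range(n)]      # F_sgt, drawn independently
    sva = [p + t + f for p, t, f in zip(persist, trans, other)]
    var_sva = cov(sva, sva)
    beta = cov(persist, sva) / var_sva        # slope of persistent effect on school VA
    share = cov(persist, persist) / var_sva   # first term of the decomposition
    sorting = cov(persist, other) / var_sva   # second term (principal-school sorting)
    return beta, share, sorting


beta, share, sorting = simulate_beta()
print(round(beta, 3), round(share, 3), round(sorting, 3))
```

Under these assumed variances the population value of $\beta$ is 0.01 / (0.01 + 0.01 + 0.04) = 1/6, so school value added would be a weak predictor of persistent principal effectiveness even with no estimation error.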

In practice, we cannot directly observe $[\theta_p + \theta_{pg}]$. At best, we can obtain estimates of
principal value added based on a finite set of years and students. These estimates reflect not only
$[\theta_p + \theta_{pg}]$ but also transient components of principal effectiveness and random sampling error
stemming from any idiosyncratic characteristics of the particular students used in estimation.

School value added also must be estimated on a particular sample of years and students.
Therefore, school value-added estimates also reflect transient components of principal
effectiveness and sampling error, among other factors. If the years or students used to estimate
school value added overlap with those used to estimate principal value added, then the two sets
of estimates will be correlated, in part, due to reflecting the same values of $\theta_{pt}$ or $\theta_{pgt}$, or the
same idiosyncratic student characteristics. In this case, the coefficient from a linear regression of
principal value-added estimates on school value-added estimates would be biased upward,
measuring more than just the relationship between school value added and $[\theta_p + \theta_{pg}]$.

Our empirical strategy avoids these biases by estimating principal and school value added on
independent samples, that is, separate sets of years and students. As a result, the school value-added
estimates can be associated with the principal value-added estimates only as a result of being
associated with $[\theta_p + \theta_{pg}]$. In the remainder of this section, we discuss the models used for
estimating principal and school value added and our approach to assessing the relationship
between these estimates.

Estimating Principal Value Added 

To identify principals’ effects on student achievement, we exploit leadership transitions
within schools (that is, instances in which one principal replaces another at a school) and
assess the within-school changes in student outcomes induced by these leadership transitions.
Variants of this strategy have been used in all prior research on principal value-added estimators.
To implement this approach, we estimate the following principal VAM for the outcome test
score, $y$, of student $i$ in grade $g$ within school $s$ led by principal $p$ in year $t$:

$$y_{ispgt} = \mathbf{X}_{ispgt}'\boldsymbol{\lambda} + Pva_{pg} + \alpha_{sg} + \varepsilon_{ispgt}, \tag{4}$$

where $\mathbf{X}_{ispgt}$ is a vector of covariates (without an intercept), $Pva_{pg}$ is a grade-specific principal
fixed effect, $\alpha_{sg}$ is a grade-specific school fixed effect, and $\varepsilon_{ispgt}$ is a student-level error term.
The principal and school fixed effects are coefficients on vectors of principal and school dummy
variables, respectively, and represent average effects across all years used in the analysis.⁵ As we
discuss later, for the purposes of our empirical strategy it is advantageous to estimate Equation
(4) separately by grade.

The outcomes of interest are math and reading scores on state assessments, which we
standardize to have mean zero and standard deviation one within the full statewide population of
test takers in each grade and year. The student-level covariates include prior-year test scores in
both math and reading, gender, race/ethnicity, free meals and reduced-price meals participation,
English language learner (ELL) status, special education status, and whether the student’s
outcome and baseline tests had modifications. We also control for school-by-year-by-grade
averages of various student-level characteristics (gender, race/ethnicity, subsidized lunch status,
ELL status, and special education status) and year dummies.
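
The score standardization described above can be sketched as follows. This is an illustration under assumed inputs: the function name and the `(grade, year, raw_score)` record layout are hypothetical, and the sketch assumes every grade-by-year cell contains at least two distinct scores so that the standard deviation is nonzero.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_by_cell(scores):
    """Convert raw scores to z-scores within each grade-by-year cell.

    scores: list of (grade, year, raw_score) tuples (hypothetical layout).
    Returns the z-scores in the same order as the input.
    """
    cells = defaultdict(list)
    for grade, year, raw in scores:
        cells[(grade, year)].append(raw)
    # Population mean and standard deviation of each cell.
    stats = {cell: (mean(vals), pstdev(vals)) for cell, vals in cells.items()}
    return [(raw - stats[(g, y)][0]) / stats[(g, y)][1] for g, y, raw in scores]


example = [(4, 2009, 1200), (4, 2009, 1400), (5, 2009, 10), (5, 2009, 30)]
print(standardize_by_cell(example))
```

Standardizing within grade and year places scores from different tests on a common scale, which is why value-added differences can later be read in student-level standard deviations.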


⁵ Because students might be observed in more than one school in a given year due to mobility, the “dummy”
variables are not purely dichotomous. For each student-by-year observation, we allow the value of a school
“dummy” variable to vary continuously from 0 to 1 to represent the implied fraction of the school year in which the
student was attending the given school in the given year. For each student-by-year observation, the value of a
principal dummy variable is identical to the value of the dummy variable for the school that the principal leads in
that year; therefore, principal dummy variables can also vary continuously from 0 to 1. We do not have exact
measures of the fraction of the year in which a student attends a school; we impute this fraction with the reciprocal
of the number of schools in which the student is observed in a given year.
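
The imputation rule in this footnote can be illustrated with a short sketch. The function name and the `(student_id, year, school_id)` record layout are assumptions made for the example; only the reciprocal rule itself comes from the text.

```python
from collections import defaultdict


def enrollment_fractions(records):
    """Impute the fraction of a year a student spent at each school.

    records: iterable of (student_id, year, school_id) tuples (hypothetical layout).
    Returns {(student_id, year, school_id): fraction}, where the fraction is the
    reciprocal of the number of distinct schools observed for that student-year.
    """
    schools = defaultdict(set)
    for student, year, school in records:
        schools[(student, year)].add(school)
    return {
        (student, year, school): 1.0 / len(attended)
        for (student, year), attended in schools.items()
        for school in attended
    }


records = [("a", 2009, "A"), ("a", 2009, "B"), ("b", 2009, "A")]
print(enrollment_fractions(records))
```

A student seen in two schools in one year contributes a value of 0.5 to each school's "dummy" variable, and the same value to the dummy of the principal leading that school in that year.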
To estimate Equation (4), we use data on outcome scores from the school years 2008-2009
to 2012-2013 for only those students attending a school that ever experienced a leadership
transition during that period. Principals in charge of fewer than 20 student-years during the
analysis period are excluded from the sample.

Each principal’s estimated fixed effect, $\widehat{Pva}_{pg}$, serves as the principal’s value-added
estimate. Differences in value added across principals are measured in terms of standard
deviations of student-level scores. Because the model conditions on school fixed effects, only
changes over time in the identities of principals leading a school are used to identify the principal
fixed effects. This purges the estimated principal effects of all unobserved, school-specific
influences on student achievement that remain invariant over the period of analysis, such as
persistent between-school differences in resources or neighborhood quality.

Although school fixed effects improve the causal validity of the estimated principal effects,
including them in the principal VAM has several costs. First, the VAM generates
effectiveness estimates only for principals who have led schools with a leadership transition
during the analysis period. Second, for those principals with estimates, the VAM allows each
principal to be compared only to a limited set of other principals. Naturally, the most direct
comparisons are between principals who have served at the same school. If, after controlling for
all covariates, student outcomes at a given school are better under a successor than under a
predecessor, then the successor will have a more positive value-added estimate than the
predecessor.

Beyond these direct comparisons, there is a somewhat broader, but still limited, group to
which each principal can be compared. Comparisons can be made among principals who have
served in the same connected network of schools, where a network is a set of schools in which
every school has had at least one principal transfer to at least one other school in the network
during the analysis period. We use the term “network” to refer, interchangeably, to either the
set of schools connected by these transfers or the set of principals who ever served in any of
those schools. Figure 1 provides a hypothetical example of how comparisons can be made within
a network of two schools (schools A and B) and three principals (principals 1, 2 and 3). Based on
this scenario, the VAM makes a comparison within school B to estimate that principal 3 is more
effective than principal 1 by 0.2 units, and makes a comparison within school A to estimate that
principal 1 is more effective than principal 2 by 0.2 units. By the assumption of transitivity,
principal 3 is deemed to be more effective than principal 2 by 0.4 units. In contrast, without the
inclusion of school fixed effects, principal 2 would be regarded as more effective than principal 3
due to the fact that student outcomes are generally better at school A than at school B.

As noted by Dhuey and Smith (forthcoming), the raw estimates of Equation (4) force one
principal per network to have a value-added estimate of zero. We follow previous studies in
re-centering value-added estimates by network, so that the resulting estimate for every principal is
expressed relative to the average principal in the same network. We recalculate standard errors of
the value-added estimates accordingly.
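
The network construction and re-centering steps can be sketched as follows. This is an illustrative sketch rather than the authors' code: the function names are hypothetical, networks are found as connected components of principal-school links with a simple union-find, and re-centering uses an unweighted network mean, which is an assumption because the text does not specify the weighting. The raw values mimic the Figure 1 scenario with principal 2 normalized to zero.

```python
from collections import defaultdict


def find_networks(spells):
    """Group principals into connected networks.

    spells: iterable of (principal, school) pairs observed in the analysis period.
    Returns {principal: network_label}; principals share a label when they are
    linked through a chain of schools and transferring principals.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for principal, school in spells:
        # Tag nodes so a principal id can never collide with a school id.
        parent[find(("p", principal))] = find(("s", school))
    return {node[1]: find(node) for node in list(parent) if node[0] == "p"}


def recenter_by_network(estimates, network_of):
    """Subtract the (unweighted) network mean from each raw estimate."""
    groups = defaultdict(list)
    for principal, value in estimates.items():
        groups[network_of[principal]].append(value)
    means = {net: sum(vals) / len(vals) for net, vals in groups.items()}
    return {p: v - means[network_of[p]] for p, v in estimates.items()}


# Principal 1 served at both schools A and B, linking principals 1, 2, and 3.
spells = [(1, "A"), (2, "A"), (1, "B"), (3, "B")]
raw = {1: 0.2, 2: 0.0, 3: 0.4}  # hypothetical raw estimates, principal 2 set to zero
print(recenter_by_network(raw, find_networks(spells)))
```

Re-centering leaves within-network differences unchanged (principal 3 still exceeds principal 2 by 0.4 units) but removes the arbitrary choice of which principal is normalized to zero.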

As we discuss in Section 3, the vast majority of principals with VAM estimates from
Equation (4) belong to networks that are no larger (and usually smaller) than the scenario shown
in Figure 1. Therefore, the principal VAM generally cannot be applied to a real-world evaluation
system that seeks to assess a principal’s effectiveness relative to a much broader comparison
group. Nevertheless, Equation (4) represents the best available method for obtaining principal
effectiveness estimates purged of time-invariant, school-specific influences beyond the
principals’ control, and we can use these estimates to assess the usefulness of more widely
applicable performance measures (measures of school effectiveness) as tools for evaluating
principals. We turn next to our strategy for estimating school effectiveness.

Estimating School Value Added 

We estimate the effectiveness of each principal’s school in the 2007-2008 school year, a
year prior to the period on which the principals’ own effectiveness measures are based. Using
student outcome data from that year, we estimate a school VAM of the following form,
separately by grade:

$$y_{ispg,2007\text{-}08} = \mathbf{X}_{ispg,2007\text{-}08}'\boldsymbol{\kappa} + Sva_{spg,2007\text{-}08} + \varepsilon_{ispg,2007\text{-}08}, \tag{5}$$

where $Sva_{spg,2007\text{-}08}$ is the grade-specific fixed effect for school $s$, capturing the value added of
that school in 2007-2008. All other variables of Equation (5) are defined as before. We control
for the same covariates as in the principal VAM except for the year dummies and
school-by-year-by-grade averages of student characteristics, given that Equation (5) is estimated on one
year of outcome data. The schools in the estimation sample for Equation (5) consist only of
schools that have at least 20 students in the given grade in 2007-2008 and are led by principals
who have principal value-added estimates from Equation (4). For direct comparability with the
principal value-added estimates, we group school value-added estimates from 2007-2008 into
networks according to the networks to which their principals belong in the 2008-2009 to
2012-2013 period (the years of the principal VAM). We then re-center the school value-added
estimates by network.

Estimating the Association Between School and Principal Value Added

In the final stage of the analysis, we examine the extent to which principals’ value added in
the 2008-2009 through 2012-2013 school years can be predicted by the value added of the
schools that they led in an earlier year, 2007-2008. Our basic approach is to estimate a
regression at the principal-by-grade level, with grade-specific principal value-added estimates
regressed on school value-added estimates from the same grade.

Our empirical strategy removes factors that could otherwise lead to spurious associations
between the principal and school value-added estimates. As stated earlier, the advantage of
estimating principal and school value added from different sets of years is that completely
transitory elements of principal effectiveness ($\theta_{pt}$ and $\theta_{pgt}$), which are likely not of interest to
evaluators, cannot contribute to a correlation between the two types of value-added estimates.
Another advantage is that the same students do not contribute to both the school and principal
value-added estimates being paired. This prevents correlation in the sampling error of the two
estimates.

To achieve this separation, we pair grade-specific school value-added estimates from 2007-2008 with
same-grade principal value-added estimates from 2008-2009 through 2012-2013. This approach
yields close-to-independent samples, with the only overlap consisting of a small number of
students who repeat the same grade across years. Formally, our dataset consists of the pairs
$(\widehat{Sva}_{spg,2007\text{-}08}, \widehat{Pva}_{pg})$, which we refer to as principal-grade observations, for each of the grades
$g$ from 4 to 8.

To estimate the association between school and principal value added, we pool together all
principal-grade observations across all grade levels and estimate the following specification:

$\widehat{Pva}_{pg} = \gamma \, \widehat{Sva}_{sg,2007\text{-}08} + N_{pg}'\lambda + \xi_{pg}$    (6)

where $N_{pg}$ is a vector of network dummies. Inclusion of network dummies appropriately forces
comparisons to be made only among principals in the same network. To improve the efficiency
of the estimates of Equation (6), we weight each principal-grade observation by the inverse of
the squared standard error of the principal value-added estimate.
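
As a rough illustration of the weighted regression in Equation (6), the sketch below computes the slope on school value added after absorbing the network dummies through weighted within-network demeaning, which gives the same slope as including the dummies explicitly (the Frisch-Waugh-Lovell result). The function name, argument names, and toy numbers are illustrative assumptions, not the authors' code or data.

```python
from collections import defaultdict


def within_network_wls_slope(pva, sva, se_pva, network):
    """Weighted slope of principal value added (pva) on school value added (sva).

    Weights are 1 / SE^2 of the principal estimate. Network dummies are
    absorbed by subtracting weighted network means from both variables.
    """
    weights = [1.0 / se ** 2 for se in se_pva]

    # Accumulate, per network: sum of weights, weighted sum of x, weighted sum of y.
    sums = defaultdict(lambda: [0.0, 0.0, 0.0])
    for w, x, y, net in zip(weights, sva, pva, network):
        acc = sums[net]
        acc[0] += w
        acc[1] += w * x
        acc[2] += w * y

    numerator = 0.0
    denominator = 0.0
    for w, x, y, net in zip(weights, sva, pva, network):
        sum_w, sum_wx, sum_wy = sums[net]
        x_dev = x - sum_wx / sum_w  # deviation from weighted network mean
        y_dev = y - sum_wy / sum_w
        numerator += w * x_dev * y_dev
        denominator += w * x_dev * x_dev
    return numerator / denominator


# Toy data (hypothetical): pva = 0.5 * sva + a network-specific constant.
network = ["A", "A", "B", "B", "B"]
sva = [0.1, -0.1, 0.2, 0.0, -0.2]
offset = {"A": 0.3, "B": -0.1}
pva = [0.5 * x + offset[n] for x, n in zip(sva, network)]
se_pva = [0.1, 0.2, 0.1, 0.3, 0.2]
slope = within_network_wls_slope(pva, sva, se_pva, network)
```

Because the network constants drop out under demeaning, the slope in this toy example depends only on within-network variation, which mirrors the role of the network dummies described above.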

Of central interest in the study is whether the coefficient estimator from Equation (6),
$\hat{\gamma}$, provides a consistent estimator of the true relationship ($\beta$) between school
value added and the persistent component of principal effectiveness. To examine this, it is
instructive to express each set of value-added estimates in terms of various components. Let $T$
denote the set of five school years (2008-2009 through 2012-2013) on which the principal VAM is
estimated. Using Equation (2) and the fact that $\widehat{Pva}_{pg}$ measures a principal’s
effectiveness with some sampling error, we have

$\widehat{Pva}_{pg} = \theta_p + \theta_{pg} + \frac{1}{5}\sum_{t \in T}\theta_{pt} + \frac{1}{5}\sum_{t \in T}\theta_{pgt} + \varepsilon_{pg}$    (7)

where $\varepsilon_{pg}$ represents sampling error and all other terms are defined as in
Equation (2). A similar decomposition based on Equations (1) and (2) yields

$\widehat{Sva}_{sg,2007\text{-}08} = \theta_p + \theta_{pg} + \theta_{p,2007\text{-}08} + \theta_{pg,2007\text{-}08} + \psi_{sg,2007\text{-}08} + u_{sg,2007\text{-}08}$    (8)

where $u_{sg,2007\text{-}08}$ represents sampling error and all other terms are defined as in
Equations (1) and (2).

Using the facts that $\theta_{pt}$ and $\theta_{pgt}$ are uncorrelated across years and the
sampling errors from (7) and (8) are uncorrelated with each other, $\hat{\gamma}$ converges in
probability to the following:

$\hat{\gamma} \xrightarrow{p} \frac{Cov(\theta_p + \theta_{pg},\, Sva_{sg,2007\text{-}08})}{Var(Sva_{sg,2007\text{-}08}) + Var(u_{sg,2007\text{-}08})} = \beta \cdot \frac{Var(Sva_{sg,2007\text{-}08})}{Var(Sva_{sg,2007\text{-}08}) + Var(u_{sg,2007\text{-}08})}$    (9)

Thus, $\hat{\gamma}$ converges to the true desired relationship, $\beta$, multiplied by a factor
less than one, with that factor representing the reliability of the school value-added estimates.
This result illustrates the well-known fact that measurement error in the independent variable
(which, in this case, stems from the use of a finite student sample to estimate school value
added) leads to attenuation bias in its coefficient estimate.
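
The attenuation result can be checked with a small simulation. The sketch below is a generic errors-in-variables example under assumed normal distributions with made-up parameter values; it is not a reproduction of the paper's data. It regresses an outcome on a noisy version of the true regressor and compares the slope with the true coefficient times the reliability ratio.

```python
import random


def simulate_attenuation(beta=0.5, noise_var=1.0, n=50_000, seed=0):
    """Return (OLS slope on the noisy regressor, beta * reliability).

    The true regressor has variance 1, so reliability = 1 / (1 + noise_var).
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    x_obs, y = [], []
    for _ in range(n):
        x_true = rng.gauss(0.0, 1.0)
        y.append(beta * x_true + rng.gauss(0.0, 0.5))
        # The regressor is observed with independent sampling error.
        x_obs.append(x_true + rng.gauss(0.0, noise_var ** 0.5))

    mean_x = sum(x_obs) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x_obs, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x_obs) / n
    reliability = 1.0 / (1.0 + noise_var)
    return cov_xy / var_x, beta * reliability


slope, attenuated_target = simulate_attenuation()
```

With equal signal and noise variance, the reliability is one half, so the estimated slope should sit near half of the true coefficient rather than near the coefficient itself.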

As done in Jacob and Lefgren (2005, 2008), we correct the attenuation bias by converting
$\widehat{Sva}_{sg,2007\text{-}08}$ to an empirical Bayes (EB) estimate. The EB adjustment
“shrinks” or pulls the ordinary least squares (OLS) estimate toward the average estimate in a
defined group of principals by a factor equal to the unreliability of the OLS estimate.
Substituting the EB estimate for the OLS estimate of school value added on the right-hand side of
Equation (6) exactly offsets the attenuation bias, enabling $\hat{\gamma}$ to be a consistent
estimator of $\beta$.

To construct the EB estimates of school value added, we first take the within-network
sample variance of the OLS estimates and subtract the average squared standard error of those
estimates.⁶ This difference, denoted by $V_{Sva}^{true}$, measures the variance of true school
value added, that is, the variance that would occur if school value added were observed without
error. The EB estimate of the value added of school $s$ in grade $g$ is then equal to the
corresponding OLS estimate multiplied by $V_{Sva}^{true} / (V_{Sva}^{true} + SE^2)$, where $SE$
is the standard error of the OLS estimate. This approach implicitly shrinks the OLS estimates
toward zero, given that the OLS estimates of school value added average to zero within each
network. We then substitute the EB estimates of school value added in place of the OLS estimates
in Equation (6).

⁶ In these calculations, each principal-grade observation continues to be weighted by the inverse of the squared
standard error of the principal value-added estimate, for consistency with how Equation (6) is estimated.
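
The shrinkage step can be sketched in a few lines. The version below is a simplified, unweighted illustration for a single network: it omits the observation weights that the paper applies, floors the estimated true variance at zero (an added safeguard that the paper does not describe), and uses hypothetical function names and inputs.

```python
from statistics import mean, variance


def eb_shrink(ols_estimates, standard_errors):
    """Shrink OLS value-added estimates toward zero (the assumed network mean).

    true_var = sample variance of the estimates - average squared standard error
    shrunken = estimate * true_var / (true_var + SE^2)
    """
    total_var = variance(ols_estimates)
    noise_var = mean(se ** 2 for se in standard_errors)
    true_var = max(total_var - noise_var, 0.0)  # assumption: floor at zero
    return [
        est * true_var / (true_var + se ** 2)
        for est, se in zip(ols_estimates, standard_errors)
    ]


# Hypothetical estimates that average to zero within the network.
shrunken = eb_shrink([0.3, -0.3, 0.1, -0.1], [0.1, 0.1, 0.1, 0.1])
```

In this toy case the sample variance is 0.2/3 and the average squared standard error is 0.01, so every estimate is multiplied by 0.85. Noisier estimates (larger standard errors) would be pulled further toward zero.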

In short, our empirical strategy produces a consistent estimator of the extent to which school
value added can predict persistent elements of a principal’s effectiveness. The coefficient
estimate from Equation (6) can be used to answer the following question: For any given
difference in true school value added between two schools, what proportion of that difference, on
average, reflects persistent differences in the effectiveness of their current principals?

As an alternative way to gauge how strongly school value added signals principals’
effectiveness, we also transform the regression coefficient into an R-squared value, again
adjusted for estimation error. The R-squared answers the following question: Even without
estimation error, what proportion of the variation in principals’ subsequent effectiveness can be
predicted from information revealed by school value added? To obtain the R-squared value, we
square the regression coefficient and multiply it by $(V_{Sva}^{true} / V_{Pva}^{true})$, where
$V_{Sva}^{true}$ is defined above and $V_{Pva}^{true}$ is the variance of true principal value
added in the absence of sampling error, which we estimate using the same methods used to estimate
$V_{Sva}^{true}$.
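
The transformation from coefficient to R-squared is a one-line calculation, sketched below. The coefficient (0.07) and the principal standard deviation (0.14) are the math values reported in the results, while the school standard deviation of 0.20 is a purely hypothetical input chosen for illustration.

```python
def implied_r_squared(gamma, var_true_school, var_true_principal):
    """R-squared implied by a regression slope and the two true variances.

    For a single regressor, R^2 = slope^2 * Var(x) / Var(y).
    """
    return gamma ** 2 * var_true_school / var_true_principal


# Hypothetical school SD of 0.20; principal SD of 0.14 and slope of 0.07
# are the math figures reported in the results section.
r2_example = implied_r_squared(0.07, 0.20 ** 2, 0.14 ** 2)
```

The formula is the usual identity for a one-regressor model: the squared correlation equals the squared slope scaled by the ratio of the regressor's variance to the outcome's variance.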

When interpreting the estimates of Equation (6), it is important to consider the types of
principals to whom these results can be generalized. In order to have principal value-added
estimates, all principals in the analysis must be involved in a leadership transition between some
pair of successive years during the analysis period of the principal VAM. This means that in at
least one of the final four years of that period (2009-2010 through 2012-2013) the principals
in the analysis either started leading a school they had not led before or were replaced by an
incoming principal. Therefore, the results of this paper indicate how well school value added
predicts principals’ effectiveness when principal value added is measured in the principals’ first
four years at their positions or is measured relative to others who are in their first four years at
their positions. As indicated above, 55 percent of Pennsylvania principals leading schools
serving grades 4 through 8 in 2012-2013 began their current position no more than four years
earlier. Thus, our findings are relevant to a majority of the state’s principal workforce.


3. DATA AND DESCRIPTIVE STATISTICS 
Data Sources 

Our data come from longitudinally linked student-level files obtained from the Pennsylvania
Department of Education on all public school students in the state. The first set of files contains
student achievement scores from the Pennsylvania System of School Assessment (PSSAs) in
math and reading for grades 3 to 8 from 2006-2007 to 2012-2013. The PSSAs are standardized
tests that Pennsylvania uses for compliance with federal school accountability policies. Nearly
97 percent of all Pennsylvania students in these grades have a PSSA scaled score from
2012-2013.⁷ We link these scores to a second set of files containing administrative records from
the Pennsylvania Information Management System (PIMS) on students in grades 4 to 8 from
2007-2008 to 2012-2013. PIMS data include information on students’ gender, race/ethnicity, free
and reduced-price meal participation, ELL status, and special education category. PIMS also
allows us to link students to the schools they attended during the year and the principal that led
each respective school. Altogether, we have current and prior achievement scores and other
information on students from six school years. We use these data to study school effectiveness in
the first year and principal effectiveness across the five subsequent years.

⁷ The PSSAs were available in both regular and modified versions until 2012-2013, when the modified version
was discontinued. Two percent of all students with scores took the modified test when it was offered, which was
intended for some special education students. We include these scores in the VAMs with an indicator for taking the
modified version. Of students without any PSSA score, nearly half took the Pennsylvania Alternate System of
Assessment (PASA). The PASA is intended for students with severe cognitive impairments.

Principal Transitions 

As indicated by Table 1, 41 percent of Pennsylvania’s schools serving 4th through 8th
graders experienced at least one leadership change between 2008-2009 and 2012-2013.⁸ The
annual rate of school leadership changes during these years ranged between 10 and 12 percent. A
total of 3,269 principals served at 2,532 total schools during these five years; of these principals,
1,929 served at one of the 1,034 schools where a leadership transition occurred.⁹ We estimate
value-added scores for these 1,929 principals using the estimation strategy described in Section
2, which yields effectiveness data for 59 percent of all principals in our sample.

Relative comparisons of principal effectiveness are made within small connected networks
of schools and principals. In Table 2, we report the number of networks and their sizes. We focus
on grade-specific networks in the table because our principal effectiveness models are estimated
separately by grade. Of 2,155 connected networks representing 5,238 principal-grade
observations, 1,629 networks (76 percent) include just one school, which means that the school
experienced a leadership transition but none of the principals involved were observed at any
other school. Only 57 networks include four or more schools.

⁸ The percentage of Pennsylvania’s schools serving 4th through 8th graders that experienced at least one
leadership transition increases to 48 percent if we include data from 2007-2008, which we use for calculating school
effectiveness estimates.

⁹ Some schools appear in the data to be led either by co-principals or jointly by a single principal. We restricted
our sample to only include principals, schools, and years in which a single principal led a single school. This reduces
the number of principals involved in leadership changes that we report in any pair of adjacent years in Table 1. For
instance, the principal VAM would record one total transition between 2008-2009 and 2012-2013 for a school led
by principal A in 2008-2009, jointly by principals A and B in 2009-2010, and then by principal B alone starting in
2010-2011. Transitions would not be recorded between adjacent years because 2009-2010 would be excluded for
that school.
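
One way to form connected networks of schools and principals is to treat each observed principal-school assignment as an edge and group everything reachable through shared principals or schools. The sketch below uses a union-find structure for this. It is an assumed, generic implementation with hypothetical identifiers, not the authors' procedure.

```python
def build_networks(assignments):
    """Group principals and schools into connected networks.

    `assignments` is an iterable of (principal_id, school_id) pairs.
    Returns a list of sets of ("principal", id) / ("school", id) nodes.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    for principal, school in assignments:
        union(("principal", principal), ("school", school))

    networks = {}
    for node in list(parent):
        networks.setdefault(find(node), set()).add(node)
    return list(networks.values())


# Hypothetical example: principal B moves from school 1 to school 2, linking
# the two schools; school 3 has a transition but no mobile principal.
example = [("A", 1), ("B", 1), ("B", 2), ("C", 2), ("D", 3), ("E", 3)]
networks = build_networks(example)
schools_per_network = sorted(
    sum(1 for kind, _ in net if kind == "school") for net in networks
)
```

In this example the first two schools form one network because principal B served at both, while the third school is a single-school network in which the two principals can only be compared with each other.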

The high frequency of single-school networks in the data means that most principals are
being compared to their predecessor or successor only. Although such a narrow set of
comparisons would not suffice for a real evaluation system, it still permits a clear test of the
validity of school value added. A necessary (although not sufficient) condition for school value
added to be a valid measure of principal effectiveness is that it should identify which principals
are better than others in a subsequent time period, even if each principal is just being compared
to one other principal. Studies to validate teacher value-added measures (Kane et al. 2013;
Chetty, Friedman and Rockoff 2014) have taken a similar approach, examining whether teacher
value-added estimates can predict future performance differences across teachers that are
rigorously estimated (through experimental or quasi-experimental methods) but confined to
narrow comparisons within the same grade and school.

We can include only a subset of the 5,238 principal-grade observations on 1,929 principals
in the final analysis due to several necessary sample restrictions. First, we require both that a
principal’s own effectiveness estimate be based on assessment data from at least 20
students and that another principal in the same network have an effectiveness estimate
based on at least 20 students taking the PSSA. We exclude estimates for principals not meeting
these thresholds because those estimates are likely to be heavily influenced by the performance
of a few students. As indicated by Table 3, imposing this restriction reduces our sample to 5,059
principal-grade observations on 1,881 principals. Second, we can only use effectiveness data on
principals who led a school with students in the same grade in 2007-2008, reducing our sample
further to 2,001 principal-grade observations on 802 principals. Third, each principal network in
the final analysis sample must include at least two principals after the sample is restricted to
principals who had led schools in 2007-2008. This restriction removes from the sample many
principals who were originally in two-principal networks because the other principal was
removed through either of the first two restrictions. This restriction ensures that all principals in
the final analysis sample have at least one other principal to serve as a basis for comparing
effectiveness estimates. Following these reductions, our final analysis sample includes 673
principal-grade observations on 291 principals.

Characteristics of Principals and Students 

We provide means and standard deviations on several principal and student characteristics in
Tables 4 and 5, respectively. Table 4 shows that the professional and demographic background
characteristics of principals in the final sample were generally similar to those of the statewide
population of principals. The few differences observed were small; for instance, principals in the
final sample were slightly less likely than all principals in the state to be white (80 versus 87
percent) or to have at least a master’s degree (78 versus 85 percent). Principals in the final
sample had an average of two more years of total experience in K-12 education, which is not
surprising given that they needed to have led a school in both 2007-2008 and at least one
subsequent year.

The student-level characteristics in Table 5 include current- and prior-year PSSA scores,
gender, race/ethnicity, free and reduced-price meal participation, ELL status, and special
education status. These variables are used in the principal and school VAMs. In the first column
of data, we show statistics for Pennsylvania students in grades 4 to 8 from 2007-2008 to
2012-2013. Average test scores in z-score units are not exactly zero because we only include
students with complete data on all of the characteristics. In the next two columns, we show
statistics on the same variables for students attending schools led, respectively, by principals with
effectiveness estimates and by principals in the final analysis sample. The last column shows
descriptive statistics for students in grades 4 to 8 in 2007-2008 who contribute to school VAM
estimates used in the final analysis sample. Overall, the data indicate that school leadership
changes are more frequent in schools with lower-performing students. In addition, the samples
for the final analysis have larger proportions of students eligible for free meals and black
students than are typical across Pennsylvania, suggesting that our results are based on students
who are more socioeconomically disadvantaged.


4. RESULTS
Main Findings 

Table 6 presents findings from the estimation of Equation (6), assessing the extent to which
school value added predicts subsequent principal value added. We show separate results for math
and reading outcomes. The coefficient estimates (0.07 for math and -0.03 for reading) are
statistically insignificant.¹⁰ The point estimates indicate that no more than 7 percent of any given
difference in value added between two schools reflects persistent differences in the effectiveness
of their current principals. As an alternative interpretation of this relationship, the implied
R-squared value for the math estimate indicates that information about principals’ persistent levels
of effectiveness revealed by school value added explains only 1 percent of the variation in
principals’ subsequent effectiveness. In both subjects, school effectiveness does not appear to
provide evaluators with useful information for predicting the subsequent performance of
principals.

¹⁰ We also estimated models where school effectiveness in reading (math) was used to predict principal
effectiveness in math (reading). The relationships were statistically insignificant in both subjects.

Our central finding that school value added is a poor predictor of principal value added
would be meaningless unless our sample actually had considerable variation in principal value
added. In fact, principals in the sample vary substantially in their value added, even within the
same network. We find that the within-network standard deviation of principal effects, after
removing variation attributable to random sampling error, is 0.14 student-level standard
deviations in math and 0.11 student-level standard deviations in reading. To put this variation in
context, the standard deviation of principal effects in our sample is at least 80 percent of the size
of the standard deviation of teacher effects based on Hanushek and Rivkin’s (2010) synthesis.
Nevertheless, very little of this variation in principal effectiveness can be predicted by school
value added.

We next conduct several additional analyses to assess the robustness of the main findings,
including exploring four alternative VAM specifications, different samples of principal-grade
observations, several subgroups of principals, and measures of principal value added that account
explicitly for the possibility that principals’ effects are only gradually manifested after starting to
lead a school.

Alternative VAM Specifications 

There is currently debate among researchers about how to measure school and principal
effects using a VAM, so it is important to explore the sensitivity of our main findings to
alternative approaches. For example, Ehlert, Koedel, Parsons, and Podgursky (2013) compared
school effects obtained under different VAM specifications. While the correlations across
specifications typically were at or above 0.90, specifications that did not aggressively control for
student background characteristics were more likely to assign higher effectiveness scores to
schools serving larger shares of students with more-advantaged characteristics. The authors
recommend a two-step estimation strategy where school effects (and by extension principal
effects) are completely orthogonal to background characteristics. They argue that school fixed
effects otherwise are potentially biased because the coefficients on the student background
variables are attenuated due to relatively little within-school variation in the covariates.

We replicate our main analyses using the Ehlert et al. (2013) approach along with three
additional conceptualizations¹¹: (1) using student growth percentiles as the outcome variable in
lieu of including covariates in the VAMs, (2) estimating principal value added without
conditioning on students’ prior-year scores, and (3) estimating principal value added omitting a
principal’s first year leading a school. The student growth percentiles model (see Betebenner
2009) is a well-known approach that is used in Colorado and in other districts and states.¹² The
specification that excludes students’ prior-year scores avoids any potential concern that those
prior-year scores are endogenous to the current principal’s effectiveness if the principal was also
leading the same school in the prior year. Identification of principal effects thus comes from
within-school changes in test score levels associated with leadership transitions. Finally, the
specification that excludes the first year of a principal’s tenure as a school’s leader from the
principal VAM purges any immediate decline in either student achievement or the school
environment that may result from principal transitions. To the extent that the transition year is
marked by some degree of upheaval, excluding this year in the principal VAM could raise the
correlation between value-added estimates of principals and the schools they formerly led.

¹¹ To apply the Ehlert et al. (2013) approach, the first step for both the school and principal VAMs is to regress
student outcome scores on all covariates and to obtain the residuals. The second step is to regress the residuals on
school dummy variables (in the case of the school VAMs) or both school and principal dummy variables (in the case
of the principal VAMs).

¹² We carry out a nonparametric version of this model by first calculating the percentile of each student’s
outcome score among all students in the same grade and year who earned the same scale score on the previous
year’s assessment in the same subject. We then regress these percentiles on school dummy variables (in the case of
school VAMs) or both school and principal dummy variables (in the case of principal VAMs).

In Table 7, we report the findings using alternative specifications for math and reading.
Across all of these specifications, school value added is never a statistically significant predictor
of subsequent principal value added. In all but one estimate, the magnitudes of the relationships
between school and principal value added are small, implying that no more than 12 percent of
any given difference in true school value added serves as a signal of principals’ persistent levels
of effectiveness.¹³

Using Reverse Predictions 

The next analysis uses an alternative sample of principal-grade observations that is
constructed to capture a similar relationship between school value added and principals’
persistent levels of effectiveness as in Table 6. The alternative sample comes from “reverse
predictions” that use future school effectiveness estimates to predict principal effectiveness in
previous years. That is, for each of the grades 4 to 8, we pair a school’s effect in that grade from
the 2012-2013 year with the principal’s effect between 2007-2008 and 2011-2012 based on the
same grade. Reverse predictions can be used in our analytic framework because measuring the
relationship between school value added and principals’ persistent levels of effectiveness should
not depend on the chronological order of when each estimate is obtained, provided they are still
based on different time periods and student samples.

¹³ The only exception is the analysis that excludes the first year after leadership transitions from the principal
VAM. The math coefficient (0.51) is large but also noisy.

We show findings based on the reverse predictions in Table 8. The estimated relationships
for both subjects continue to be small and statistically insignificant. Because this robustness
analysis is designed to apply the same empirical method to a different set of observations, the
findings, along with those from using alternative VAM specifications, reinforce our
conclusion that school value added provides poor information about principals’ persistent levels
of effectiveness.

Principal Subgroups 

In Table 9, we report findings from estimating Equation (6) separately for elementary school
principals, middle school principals, and principals outside of Philadelphia and Pittsburgh. We
exclude Philadelphia and Pittsburgh because officials in those districts presumably have the
greatest opportunity to make compensatory assignments of more effective principals to less
effective schools by virtue of managing the most schools in Pennsylvania. Since principal
transitions brought about by compensatory assignments reduce the observed correlation between
school effectiveness and principal quality, excluding these districts may increase the magnitude
of the estimates. In all three analyses, however, we cannot reject the null hypothesis that school
effectiveness estimates provide no indication of principals’ persistent levels of effectiveness.

Accounting for the Gradual Manifestation of Principals’ Effects 

We next consider the possibility that the full effects of principals are not manifested
immediately after assuming a school’s leadership due to the lingering effects of the predecessor
principal. For instance, although staffing decisions could be a key channel through which
principals affect student outcomes (Branch et al. 2012), it may take time for a newly arriving
principal to replace teachers who were hired by the previous principal. Likewise, a newly
arriving principal may only gradually be able to change attitudes, expectations, and other aspects
of a school’s culture that were fostered by the predecessor.

School value added ought to be judged by how well it predicts principals’ full effects,
because those effects are the most complete representation of a principal’s capabilities as a
leader. However, it is possible that the principal value-added estimates in our analysis may not
yet encapsulate a principal’s full effect. Our analysis is based on comparisons between departing
principals and successors who started their positions within the last four years, so there may not
have been sufficient time for successors to fully undo the influences of the predecessors. The
concern, then, is that the variance in the principal value-added estimates may be artificially
compressed, leading to a downward biased coefficient on school value added when predicting
principal value added.

Our solution is to inflate the estimated within-school differences in principal value added to
the extent that those differences are likely to be artificially compressed. Adapting the framework
used by Coelli and Green (2012), we define the leadership effect at school $s$ in year $t$,
$L_{st}$, to be the combined effect of decisions and actions made by the current and prior
principals on a school’s effectiveness in the current year.¹⁴ The leadership effect is a weighted
average of the previous year’s leadership effect and the full underlying effect of the current
principal ($\theta_p$), with weight $\rho$ on the previous year:

$L_{st} = \rho L_{s,t-1} + (1 - \rho)\theta_p$    (10)

Our existing estimate for the within-school difference in value added between two principals
is actually the difference in leadership effects between the period under a successor and the
period under a predecessor. Consider a school in which we observe exactly one predecessor
($p = a$) and one successor ($p = b$) during the five school years ($t = 1, \ldots, 5$) of the
principal VAM, with the successor starting in year $t^*$. Normalizing $L_{s1} = \theta_a$ and
conducting repeated forward substitution of Equation (10), we can see that for every year
$t \ge t^*$ the leadership effect is
$L_{st} = \rho^{t - t^* + 1}\theta_a + (1 - \rho^{t - t^* + 1})\theta_b$. Therefore, the average
difference in leadership effects between the successor’s period ($t = t^*, \ldots, 5$) and the
predecessor’s period ($t = 1, \ldots, t^* - 1$), which is the quantity captured in our existing
principal value-added estimates, is

$\bar{L}_{s,b} - \bar{L}_{s,a} = \left[ 1 - \frac{1}{6 - t^*} \sum_{t = t^*}^{5} \rho^{t - t^* + 1} \right] (\theta_b - \theta_a)$    (11)

If $\rho > 0$, that is, if the predecessor’s influence lingers into the successor’s tenure, then
our estimated difference in the principals’ value added is less than the full underlying difference
in their effectiveness, $\theta_b - \theta_a$. However, Equation (11) makes clear how to recover
the full underlying difference in effectiveness: multiply the estimated difference by the inflation
factor $\left[ 1 - \frac{1}{6 - t^*} \sum_{t = t^*}^{5} \rho^{t - t^* + 1} \right]^{-1}$.

¹⁴ For ease of exposition, in this framework we abstract away from idiosyncratic components of principal
quality, such as completely transitory or grade-specific components. Introducing those components would not yield
extra insights from this framework.
Our data do not afford a sufficiently long time series to estimate $\rho$ with precision. For this
sensitivity analysis, we borrow the estimate of $\rho$ from Coelli and Green (2012), who find
$\rho \approx 0.7$ for achievement outcomes. We then inflate the within-school differences in the
principal value-added estimates by the inflation factor described above.¹⁵
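
To make the adjustment concrete, the sketch below simulates the leadership-effect recursion for one predecessor and one successor over five years and then applies the inflation factor implied by that recursion. It assumes the first-year leadership effect equals the predecessor's full effect. The function names and the effect sizes in the example are hypothetical.

```python
from statistics import mean


def leadership_path(theta_a, theta_b, t_star, rho, n_years=5):
    """Leadership effect in each year t = 1..n_years.

    Principal a leads before year t_star, principal b from t_star onward.
    Each year: L_t = rho * L_{t-1} + (1 - rho) * theta_current.
    """
    level = theta_a  # normalization: the school starts at a's full effect
    path = []
    for t in range(1, n_years + 1):
        theta_current = theta_a if t < t_star else theta_b
        level = rho * level + (1.0 - rho) * theta_current
        path.append(level)
    return path


def inflation_factor(t_star, rho, n_years=5):
    """Reciprocal of the share of (theta_b - theta_a) that the raw gap captures."""
    n_successor_years = n_years - t_star + 1
    mean_carryover = (
        sum(rho ** k for k in range(1, n_successor_years + 1)) / n_successor_years
    )
    return 1.0 / (1.0 - mean_carryover)


# Hypothetical case: successor arrives in year 3, rho = 0.7.
path = leadership_path(theta_a=0.0, theta_b=0.2, t_star=3, rho=0.7)
raw_gap = mean(path[2:]) - mean(path[:2])
recovered_gap = raw_gap * inflation_factor(t_star=3, rho=0.7)
```

The later the successor arrives within the five-year window, the fewer years there are for the predecessor's influence to fade, so the inflation factor grows as the successor's start year approaches the end of the window.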

The results, shown in Table 10,¹⁶ yield similar conclusions as our main findings. School
value added has no significant relationship with the effectiveness that principals demonstrate in
future years. Given the negative point estimates, we cannot have any confidence in the claim that
a school whose value added exceeds another has the more effective principal.


Using Value-Added Estimates Based on Multiple Grades and Subjects 

Thus far, our analyses have assessed the extent to which grade-specific school value added
can predict the principal’s subsequent value added in the same grade. As described earlier,
requiring the school and principal value-added estimates that are paired with each other to be
based on the same, single grade (from different years) effectively ensures that the same students
do not contribute to both estimates and, hence, avoids spurious correlations between the
estimates. However, in practice, the school value-added estimates that state and district officials
see are likely to be those that reflect a school’s effect averaged across all tested grades that it
serves. Therefore, as a more direct evaluation of school value-added estimates that are likely to
be used in practice, we assess how well a school’s average effect across all tested grades in
2007-2008 predicts the future value added, also averaged across all tested grades, of the
school’s principal.¹⁷ Despite having the advantage of using value-added estimates that are more
aligned with practice, this analysis, unlike the main analysis, is vulnerable to bias from
correlated sampling errors. For instance, an unusually bright group of fourth graders that inflates
a school’s value-added estimate in 2007-2008 could, upon reaching fifth grade the next year,
inflate the same principal’s subsequent value-added estimate, inducing a spurious correlation
between the two estimates.

¹⁵ We do not apply the inflation factor to the school value-added estimates. School value-added estimates do
not, by definition, adjust for principals’ length of service since those estimates are focused on the effectiveness of a
school’s entire staff. Moreover, given that school value added is the independent variable for predicting principal
value added, inflating the variance in that independent variable would actually decrease the coefficient on that
variable, the opposite of our intention.

¹⁶ For the analyses in Table 10, we narrow the sample to principals who served in a school with only one other
principal (predecessor or successor) during the period of the principal VAM, and we narrow the definition of a
network to be a single school only. This guarantees that every principal’s value-added estimate is based solely on a
comparison with one other principal who served at the same school. Due to this limitation in the sample and network
type, we regard this approach as a sensitivity analysis only.

The findings, shown in Table 11, are nevertheless consistent with our main findings. The 
first two columns of results, which focus on math and reading separately, fail to reject the 
hypothesis that a school’s value added is uninformative of the same principal’s subsequent value 
added. The analysis in the final column uses composite value-added estimates that are averaged 
across the two subjects, given that such composites are also used in practice.** Again, school 
value added has no statistically significant relationship with a principal’s subsequent value 
added. Across all of the analyses in Table 11, no more than 14 percent of any given difference in 
value added between two schools represents persistent differences in the effectiveness of the 
schools’ current principals. 

We take the weighted average of a school or principal’s unshrunken grade-specific value-added estimates, 
with weights equal to the effective number of students contributing to each value-added estimate. As in all of our 
analyses, the school value-added estimates are then shrunk with the usual empirical Bayes adjustment. 

** For each school and principal, we first take the simple average of the unshrunken math and reading 
value-added estimates. We calculate the sampling variance of the unshrunken composite estimate as one-fourth of the 
sum of three terms: the sampling variance of the math value-added estimate, the sampling variance of the reading 
value-added estimate, and twice the covariance of the math and reading value-added estimates (calculated as the 
covariance of reading and math test-score residuals from the school value-added model, divided by the effective 
number of students contributing to the school or principal’s value-added estimate). Finally, the school value-added 
estimates are shrunk with the usual empirical Bayes adjustment. 
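
The aggregation steps described in the footnotes above (weighting grade-specific estimates by effective student counts, computing the sampling variance of the math-reading composite, and shrinking) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and all input numbers are hypothetical, and the shrinkage step assumes the textbook empirical Bayes form in which an estimate is multiplied by its reliability, since the exact adjustment is not spelled out here.

```python
def weighted_average(estimates, effective_n):
    """Average grade-specific estimates, weighting by effective student counts."""
    total = sum(effective_n)
    return sum(e * n for e, n in zip(estimates, effective_n)) / total


def composite_sampling_variance(var_math, var_reading, resid_cov, effective_n):
    """Sampling variance of the simple average of math and reading estimates.

    The covariance of the two estimates is taken as the covariance of the
    test-score residuals divided by the effective number of students.
    """
    cov_estimates = resid_cov / effective_n
    return 0.25 * (var_math + var_reading + 2.0 * cov_estimates)


def empirical_bayes_shrink(estimate, sampling_var, true_var):
    """Assumed textbook form: scale the estimate by its reliability."""
    reliability = true_var / (true_var + sampling_var)
    return reliability * estimate


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
grade_average = weighted_average([0.10, -0.02, 0.06], [50, 100, 50])
composite_var = composite_sampling_variance(0.004, 0.006, 0.2, 100)
shrunk = empirical_bayes_shrink(0.04, composite_var, true_var=0.0105)
print(grade_average, composite_var, shrunk)
```

The one-fourth factor comes from taking the variance of a simple average of two estimates; the covariance term matters because the math and reading estimates are based on the same students.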

Concurrent Relationships between School and Principal Value-Added Estimates 

In our final analysis, we show how a more naive approach to evaluating the validity of 
school value added would have come to a qualitatively different conclusion than our main 
analysis. For this naive approach, we estimate the association between school and principal 
value-added estimates that come from the same years and student samples. The results, shown in 
Table 12, indicate that 12 to 14 percent of the variance in principal value added can be explained 
by concurrent estimates of school value added (with implied correlation coefficients of 0.35 to 
0.37), compared with R-squared values of no more than 1 percent under our main approach. In 
short, school and principal value-added estimates based on the same data have a moderate degree 
of consistency with each other. 
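
The implied correlation coefficients follow from the fact that, with a single predictor, the correlation is the square root of the R-squared. A short check of that arithmetic, using the two fractions reported in Table 12 (the variable names are illustrative only):

```python
import math

# Fractions of variance explained, as reported in Table 12.
r_squared_values = {"math": 0.12, "reading": 0.14}

# With one predictor, the correlation is the square root of R-squared.
implied_correlations = {
    subject: math.sqrt(r2) for subject, r2 in r_squared_values.items()
}

for subject, r in implied_correlations.items():
    print(f"{subject}: R-squared = {r_squared_values[subject]:.2f}, "
          f"implied correlation = {r:.2f}")
```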

Why does the naive approach yield far larger relationships between school and principal 
value added than our main approach does? Two factors likely explain the difference. First, the 
sampling errors of the two sets of estimates are likely to be correlated in the naive approach. 
Because the two estimates are based on the same students, whenever a school’s value-added 
estimate is erroneously high or low due to fluctuations in the types of students enrolled, the 
principal’s value-added estimate will also likely be erroneously high or low. In our data, error 
variance constitutes about 5 to 10 percent of the total variance in principal value-added 
estimates. If most of this error variance is also shared by the school value-added estimates, then 
this would account for much of the higher R-squared in the naive approach. Second, under the 
naive approach, the school and principal value-added estimates that are paired with each other 
reflect the same transitory principal effects. Thus, the association between the 
two sets of estimates is inflated by principal effects that are not of key importance to evaluators. 
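
The two mechanisms can be illustrated with a small simulation. This is a sketch under deliberately extreme, hypothetical assumptions rather than a reconstruction of the paper's analysis: school value added is built to contain no information about persistent principal quality, and every standard deviation below is invented for illustration. The only point is that estimates sharing a sampling error and a transitory principal effect are correlated anyway, whereas estimates drawn from separate years and students are not.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation computed from first principles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n_schools=20000, seed=0):
    """Return (naive correlation, separate-sample correlation)."""
    rng = random.Random(seed)
    school_est, naive_principal_est, separate_principal_est = [], [], []
    for _ in range(n_schools):
        school_factor = rng.gauss(0, 0.15)  # influences outside the principal's control
        persistent = rng.gauss(0, 0.10)     # persistent principal effect
        transitory = rng.gauss(0, 0.05)     # transitory principal effect, same years
        shared_error = rng.gauss(0, 0.04)   # sampling error from the same students
        # Hypothetical assumption: school value added carries no persistent
        # principal effect at all.
        school_est.append(school_factor + transitory + shared_error)
        # Naive approach: same years and students, so both shocks are shared.
        naive_principal_est.append(persistent + transitory + shared_error)
        # Separate years and students: fresh transitory effect and fresh error.
        separate_principal_est.append(
            persistent + rng.gauss(0, 0.05) + rng.gauss(0, 0.04)
        )
    return (pearson(school_est, naive_principal_est),
            pearson(school_est, separate_principal_est))


naive_r, separate_r = simulate()
print(f"naive correlation: {naive_r:.2f}")
print(f"separate-sample correlation: {separate_r:.2f}")
```

Because the shared components enter both estimates, the naive correlation is positive by construction even though the simulated school estimate is unrelated to persistent principal quality.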

5. CONCLUSION 

A common method of using student achievement data to evaluate principals is to estimate 
school VAMs, which capture the contributions of entire schools to student achievement growth. 
However, school value-added estimates reflect a mix of school-level factors that might or might 
not be under principals’ control. The usefulness of school VAMs as a tool for principal 
evaluations depends critically on whether differences in effectiveness across schools primarily 
reflect differences in principal quality or variation in factors over which principals have little 
discretion. 

In this paper, we have examined the relationship between school effectiveness and the 
persistent level of effectiveness demonstrated by the school’s principal. Using student-level 
longitudinal data from Pennsylvania, we measure school effectiveness and principal quality on 
the basis of school VAMs and principal VAMs, respectively. We identify principal quality from 
within-school changes in student achievement associated with leadership transitions, thereby 
purging the principal effectiveness estimates of all school-specific influences that are invariant 
over time. Moreover, we measure school effectiveness based on distinct years and students from 
those used to measure principal effectiveness. Therefore, any observed association between the 
school and principal value-added estimates must reflect effects on student achievement that 
principals persistently demonstrate. 

We conclude that there is little evidence to support the contention that school effectiveness 
is a useful tool for assessing principals’ persistent levels of effectiveness. Across multiple ways 
of specifying school and principal value-added models, school value added is never a statistically 
significant predictor of principal value added. Moreover, the magnitudes of the estimated 
relationships are small. In our main estimates, no more than 7 percent of any given difference in 
value added between two schools reflects persistent differences in the effectiveness of their 
current principals. 

There are two potential reasons for the weak association between school effectiveness and 
principals’ persistent levels of effectiveness. One possibility is that more-effective principals are 
assigned to schools with other factors that depress student learning growth, thereby masking the 
principals’ contributions. We cannot confirm or rule out the presence of such compensatory 
assignments, despite our rough, initial attempt to do so by excluding Pennsylvania’s two largest 
districts in Table 9, given that school effectiveness measures cannot disentangle principals’ 
contributions from unobserved, school-specific factors. Future research could probe more deeply 
into the ways in which principals are assigned to schools. Nevertheless, the presence or absence 
of compensatory assignments does not change our key conclusion that school value added likely 
provides a poor signal of principal quality under prevailing practices for how principals are 
assigned to schools. A second possibility, which we believe is more likely, is that school value 
added primarily reflects a combination of influences on student learning outside of the 
principals’ control as well as effects that principals do not consistently demonstrate from one 
year to the next. 
Figure 1. Hypothetical Connected Network 

[Figure: a grid of schools (School A, School B) by years (2009, 2010). Each cell names the principal in charge 
(Principal 1, 2, or 3) and an outcome (0.3, -0.3, 0.1, or -0.5); a label marks the transfer by Principal 1 between 
schools.] 

Note: Each cell of this figure represents a combination of year and school. Within each cell, the figure shows the identity of the 
principal in charge and the mean student outcome (in z-score units) after adjusting for all covariates (except the school and 
principal dummies) in the principal VAM. 
Table 1. Principal Transitions Across Schools Serving Students in Grades 4 to 8, 2007-2008 to 2012-2013 

| | Principals: Total | Principals: Involved in a Leadership Change (number) | Principals: Involved in a Leadership Change (percentage) | Schools: Total | Schools: Change in Leadership Occurred (number) | Schools: Change in Leadership Occurred (percentage) |
| --- | --- | --- | --- | --- | --- | --- |
| Pairs of Adjacent Years | | | | | | |
| 2007-2008 and 2008-2009 | 2,426 | 433 | 18 | 2,274 | 233 | 10 |
| 2008-2009 and 2009-2010 | 2,401 | 448 | 19 | 2,253 | 245 | 11 |
| 2009-2010 and 2010-2011 | 2,419 | 496 | 21 | 2,241 | 275 | 12 |
| 2010-2011 and 2011-2012 | 2,407 | 417 | 17 | 2,282 | 233 | 10 |
| 2011-2012 and 2012-2013 | 2,405 | 458 | 19 | 2,265 | 254 | 11 |
| Period for Estimating Principal Effectiveness, 2008-2009 to 2012-2013 | 3,269 | 1,929 | 59 | 2,532 | 1,034 | 41 |
| All Available Years, 2007-2008 to 2012-2013 | 3,565 | 2,378 | 67 | 2,626 | 1,492 | 48 |
Table 2. Distribution of Network Sizes 

| Size Category: Number of Schools in the Grade-Specific Network | Number of Grade-Specific Networks with the Specified Size | Total Number of Principal-Grade Observations in Networks with the Specified Size | Average Number of Principals per Network |
| --- | --- | --- | --- |
| 1 | 1,629 | 3,428 | 2.1 |
| 2 | 388 | 1,115 | 2.9 |
| 3 | 81 | 347 | 4.3 |
| 4 | 45 | 260 | 5.8 |
| 5 | 11 | 83 | 7.5 |
| 8 | 1 | 5 | 5 |
| Any | 2,155 | 5,238 | 2.4 |
Table 3. Sample Restrictions and Resulting Sample Sizes for the Analysis 

| Group | Number of Principal-Grade Observations | Number of Distinct Principals |
| --- | --- | --- |
| 1. Involved in a leadership change at some time from 2008-2009 to 2012-2013 | 5,238 | 1,929 |
| 2. Responsible for at least 20 students and in a network with another principal responsible for at least 20 students (and in Group 1) | 5,059 | 1,881 |
| 3. Led a school that spans the same grade in 2007-2008 (and in Group 2) | 2,001 | 802 |
| 4. At least one other principal in the same network meets the sample criteria (and in Group 3); final analysis sample | 673 | 291 |
| Distribution of Principal-Grade Observations by Network Size | | |
| In networks with 2 principals in final sample | 458 | |
| In networks with 3 principals in final sample | 156 | |
| In networks with 4 principals in final sample | 44 | |
| In networks with 5 principals in final sample | 15 | |
Table 4. Summary Statistics for Principal Characteristics 

Principals who Led a School Containing Any of the Grades from 4 to 8 in Any Year from 2008-2009 to 2012-2013 

| | All Principals in PA | Principals Involved in Transitions | Principals in Final Analysis Sample |
| --- | --- | --- | --- |
| Master's degree (proportion) | 0.742 | 0.737 | 0.675 |
| | (0.438) | (0.440) | (0.469) |
| Doctorate degree (proportion) | 0.104 | 0.092 | 0.107 |
| | (0.305) | (0.289) | (0.310) |
| Total years of experience in K-12 education (average) | 20.1 | 18.9 | 22.3 |
| | (9.8) | (10.3) | (10.1) |
| White (proportion) | 0.869 | 0.845 | 0.802 |
| | (0.337) | (0.362) | (0.399) |
| Female (proportion) | 0.489 | 0.507 | 0.552 |
| | (0.500) | (0.500) | (0.498) |
| Number of principals | 3,269 | 1,929 | 291 |

Note: Standard deviations are listed in parentheses below the mean for each variable in each sample. 
Table 5. Summary Statistics for Student-Level Variables 

| | All Students in PA with Complete Data, 2007-2008 to 2012-2013 | Students Contributing to Principal VAM Estimates, 2008-2009 to 2012-2013: Attending Schools Led by Any Principals with VAM Estimates | Students Contributing to Principal VAM Estimates, 2008-2009 to 2012-2013: Attending Schools Led by Principals in Final Analysis Sample | Students Contributing to School VAM Estimates in Final Analysis Sample, 2007-2008 |
| --- | --- | --- | --- | --- |
| Current-Year PSSA Score (z-score units) | | | | |
| Math | 0.046 | -0.028 | -0.029 | -0.064 |
| | (0.979) | (0.989) | (1.008) | (1.005) |
| Reading | 0.042 | -0.026 | -0.020 | -0.064 |
| | (0.984) | (1.004) | (1.027) | (1.021) |
| Prior-Year PSSA Score (z-score units) | | | | |
| Math | 0.042 | -0.012 | -0.014 | -0.064 |
| | (0.978) | (0.993) | (1.005) | (1.014) |
| Reading | 0.036 | -0.025 | -0.016 | -0.061 |
| | (0.984) | (1.002) | (1.022) | (1.025) |
| Student Background (proportions) | | | | |
| Female | 0.490 | 0.490 | 0.490 | 0.491 |
| | (0.500) | (0.500) | (0.500) | (0.500) |
| Black | 0.139 | 0.175 | 0.196 | 0.231 |
| | (0.346) | (0.380) | (0.397) | (0.421) |
| Hispanic | 0.076 | 0.088 | 0.088 | 0.099 |
| | (0.265) | (0.283) | (0.284) | (0.298) |
| Eligible for free meals | 0.322 | 0.374 | 0.368 | 0.361 |
| | (0.467) | (0.484) | (0.482) | (0.480) |
| Eligible for reduced-price meals | 0.063 | 0.059 | 0.049 | 0.053 |
| | (0.242) | (0.235) | (0.216) | (0.223) |
| English language learner | 0.020 | 0.024 | 0.027 | 0.028 |
| | (0.138) | (0.153) | (0.162) | (0.164) |
| Special education | 0.157 | 0.161 | 0.162 | 0.163 |
| | (0.364) | (0.368) | (0.369) | (0.369) |
| Number of student-year observations | 2,573,228 | 748,936 | 111,121 | 55,072 |

Note: Standard deviations are listed in parentheses below the mean for each variable in each sample. The summary statistics in 
the table include students in grades 4 to 8. 

PSSA = Pennsylvania System of School Assessment; VAM = value-added model. 
Table 6. Extent to Which School Value Added Predicts Principal Value Added in Subsequent Years: Main Findings 

Dependent Variable: Principal Value Added, 2008-2009 to 2012-2013 (in student z-score units) 

| Independent Variables (in student z-score units) | Math (1) | Reading (2) |
| --- | --- | --- |
| School Value Added, 2007-2008 | 0.07 | -0.03 |
| | (0.06) | (0.07) |
| Fraction of the Variance in True Principal Value Added that Can Be Predicted by True School Value Added | 0.01 | 0.00 |
| Adjusted Standard Deviations Within Networks (expressed in student z-score units): | | |
| Principal value added | 0.14 | 0.11 |
| School value added | 0.16 | 0.12 |
| Number of Principal-Grade Observations | 673 | 673 |
| Number of Distinct Principals | 291 | 291 |

Note: Standard errors clustered by principal are listed in parentheses below each coefficient estimate. No estimates are 
statistically significant at the 5 percent level. All models control for network-by-grade fixed effects. An adjusted standard 
deviation of value added within networks is the square root of the difference between the total within-network variance of the 
value-added estimates and the average squared standard error of the value-added estimates. 
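
The adjusted standard deviation defined in the note to Table 6 can be sketched as below. This is an illustrative reading of that definition, not the authors' implementation: the function name and inputs are hypothetical, the within-network variance is assumed to be the mean squared deviation from each network's mean (the degrees-of-freedom convention is not stated), and the result is floored at zero as a practical safeguard.

```python
import math
from collections import defaultdict


def adjusted_within_network_sd(estimates, std_errors, networks):
    """Square root of (within-network variance minus average squared SE)."""
    by_network = defaultdict(list)
    for est, net in zip(estimates, networks):
        by_network[net].append(est)
    network_means = {net: sum(v) / len(v) for net, v in by_network.items()}

    n = len(estimates)
    # Assumed convention: mean squared deviation from the network mean.
    within_var = sum(
        (est - network_means[net]) ** 2 for est, net in zip(estimates, networks)
    ) / n
    mean_squared_se = sum(se ** 2 for se in std_errors) / n

    # Floor at zero so that noisy inputs cannot produce a negative variance.
    return math.sqrt(max(within_var - mean_squared_se, 0.0))


# Hypothetical data: two networks with two value-added estimates each.
example_sd = adjusted_within_network_sd(
    estimates=[0.2, -0.2, 0.1, 0.3],
    std_errors=[0.1, 0.1, 0.1, 0.1],
    networks=["A", "A", "B", "B"],
)
print(round(example_sd, 3))
```

Subtracting the average squared standard error removes the part of the observed spread that is attributable to sampling error, so the result approximates the spread of true value added within networks.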
Table 7. Extent to Which School Value Added Predicts Principal Value Added in Subsequent Years: Alternative VAM Specifications 

Dependent Variable: Principal Value Added, 2008-2009 to 2012-2013 

| Independent Variables | Math | Reading |
| --- | --- | --- |
| Two-Step Value-Added Model for Principals and Schools | Student z-score units | Student z-score units |
| School Value Added, 2007-2008 (student z-score units) | 0.06 | -0.07 |
| | (0.06) | (0.09) |
| Fraction of the Variance in True Principal Value Added that Can Be Predicted by True School Value Added | 0.00 | 0.01 |
| Student Growth Percentiles Model for Principals and Schools | Student percentile points | Student percentile points |
| School Value Added, 2007-2008 (student percentile points) | 0.06 | -0.08 |
| | (0.07) | (0.08) |
| Fraction of the Variance in True Principal Value Added that Can Be Predicted by True School Value Added | 0.00 | 0.01 |
| No Prior-Year Scores in the Principal Value-Added Model | Student z-score units | Student z-score units |
| School Value Added, 2007-2008 (student z-score units) | 0.122 | 0.12 |
| | (0.065) | (0.07) |
| Fraction of the Variance in True Principal Value Added that Can Be Predicted by True School Value Added | 0.02 | 0.02 |
| Drop First Year After Transition in the Principal Value-Added Model | Student z-score units | Student z-score units |
| School Value Added, 2007-2008 (student z-score units) | 0.51 | -0.04 |
| | (0.27) | (0.11) |
| Fraction of the Variance in True Principal Value Added that Can Be Predicted by True School Value Added | 0.04 | 0.00 |

Note: Standard errors clustered by principal are listed in parentheses below each coefficient estimate. No estimates are 
statistically significant at the 5 percent level. All models control for network-by-grade fixed effects. All models except the final 
model are based on a sample with 673 principal-grade observations and 291 distinct principals; the final model is based on a 
sample with 589 principal-grade observations and 256 distinct principals. 
Table 8. Extent to Which School Value Added Predicts Principal Value Added in Prior Years 

Dependent Variable: Principal Value Added, 2007-2008 to 2011-2012 (in student z-score units) 

| Independent Variables (in student z-score units) | Math | Reading |
| --- | --- | --- |
| School Value Added, 2012-2013 | 0.07 | 0.14 |
| | (0.08) | (0.09) |
| Fraction of the Variance in True Principal Value Added that Can Be Predicted by True School Value Added | 0.00 | 0.02 |
| Number of Principal-Grade Observations | 649 | 649 |
| Number of Distinct Principals | 288 | 288 |

Note: Standard errors clustered by principal are listed in parentheses below each coefficient estimate. No estimates are 
statistically significant at the 5 percent level. All models control for network-by-grade fixed effects. 
Table 9. Extent to Which School Value Added Predicts Principal Value Added in Subsequent Years: Principal Subgroups 

Dependent Variable: Principal Value Added, 2008-2009 to 2012-2013 (in student z-score units) 

| Independent Variables (in student z-score units) | Math | Reading |
| --- | --- | --- |
| Elementary-School Principals | Grades 4-5 | Grades 4-5 |
| School Value Added, 2007-2008 | 0.04 | -0.07 |
| | (0.05) | (0.09) |
| Fraction of the Variance in True Principal Value Added that Can Be Predicted by True School Value Added | 0.00 | 0.01 |
| Number of Principal-Grade Observations | 423 | 423 |
| Number of Distinct Principals | 241 | 241 |
| Middle-School Principals | Grades 6-8 | Grades 6-8 |
| School Value Added, 2007-2008 | 0.13 | 0.05 |
| | (0.16) | (0.09) |
| Fraction of the Variance in True Principal Value Added that Can Be Predicted by True School Value Added | 0.02 | 0.00 |
| Number of Principal-Grade Observations | 250 | 250 |
| Number of Distinct Principals | 150 | 150 |
| Excluding Philadelphia and Pittsburgh | Grades 4-8 | Grades 4-8 |
| School Value Added, 2007-2008 | 0.05 | 0.00 |
| | (0.07) | (0.08) |
| Fraction of the Variance in True Principal Value Added that Can Be Predicted by True School Value Added | 0.00 | 0.00 |
| Number of Principal-Grade Observations | 468 | 468 |
| Number of Distinct Principals | 223 | 223 |

Note: Standard errors clustered by principal are listed in parentheses below each coefficient estimate. No estimates are 
statistically significant at the 5 percent level. All models control for network-by-grade fixed effects. 
Table 10. Extent to Which School Value Added Predicts Principal Value Added Measures that Account for the Gradual Manifestation of Principals’ Effects 

Dependent Variable: Principal Value Added, 2008-2009 to 2012-2013 (in student z-score units) 

| Independent Variables (in student z-score units) | Math | Reading |
| --- | --- | --- |
| School Value Added, 2007-2008 | -0.10 | -0.35 |
| | (0.16) | (0.22) |
| Fraction of the Variance in True Principal Value Added that Can Be Predicted by True School Value Added | 0.00 | 0.03 |
| Number of Principal-Grade Observations | 306 | 306 |
| Number of Distinct Principals | 130 | 130 |

Note: In this analysis, principal value-added estimates are adjusted to correct for the lingering influence of a school’s previous 
principal during the tenure of a subsequent principal. Standard errors clustered by principal are listed in parentheses below each 
coefficient estimate. No estimates are statistically significant at the 5 percent level. All models control for network-by-grade 
fixed effects. 
Table 11. Extent to Which School Value Added Predicts Principal Value Added in Subsequent Years, Using Value-Added Estimates Averaged Across Grades and Subjects 

Dependent Variable: Principal Value Added, 2008-2009 to 2012-2013 (in student z-score units) 

| Independent Variables (in student z-score units) | Math | Reading | Both Subjects Combined |
| --- | --- | --- | --- |
| School Value Added, 2007-2008 | 0.12 | 0.10 | 0.14 |
| | (0.07) | (0.08) | (0.09) |
| Fraction of the Variance in True Principal Value Added that Can Be Predicted by True School Value Added | 0.01 | 0.01 | 0.01 |
| Number of Principals | 291 | 291 | 291 |

Note: Robust standard errors are in parentheses. No estimates are statistically significant at the 5 percent level. 
Table 12. Extent to Which School Value Added Predicts Principal Value Added in the Same Years: Naive Approach 

Dependent Variable: Principal Value Added, 2008-2009 to 2012-2013 (in student z-score units) 

| Independent Variables (in student z-score units) | Math (1) | Reading (2) |
| --- | --- | --- |
| School Value Added, 2008-2009 to 2012-2013 | 0.24* | 0.27* |
| | (0.01) | (0.01) |
| Fraction of the Variance in True Principal Value Added that Can Be Predicted by True School Value Added | 0.12 | 0.14 |
| Number of Principal-Grade Observations | 5,059 | 5,059 |
| Number of Distinct Principals | 1,881 | 1,881 |

Note: Standard errors clustered by principal are listed in parentheses below each coefficient estimate. 
* Statistically significant at the 5 percent level. 
About the Series 

Policymakers require timely, accurate, evidence-based research as soon as it’s available. 
Further, statistical agencies need information about statistical techniques and survey practices 
that yield valid and reliable data. To meet these needs, Mathematica’s working paper series 
offers policymakers and researchers access to our most current work. 

For more information, contact Hanley Chiang, senior researcher, at hchiang@mathematica-mpr.com; 
Stephen Lipscomb, senior researcher, at slipscomb@mathematica-mpr.com; or Brian Gill, senior 
fellow, at bgill@mathematica-mpr.com. 